clarify again support for "n-tiered" implementation #149
I'll step out on a limb a bit and summarize Tom Wetmore's view: He sees the progression from individual source records to a final conclusion as a series of combinations of "Personas" (the notional persons reflected in a single document) into a set of conclusion "Persons" (each represented by the biographical sketch that is the output of Family History research). In order to model this series he argues that the conclusion model Person class should support recursive aggregation; in other words, each Person (except one derived directly from a source document) should have a 1->1..* aggregation of Person objects, each supported by a proof argument. |
I find this model somewhat attractive if a bit mechanistic. The principal advantage is the ease of dissociation: If one decides later that a particular upstream person is actually not the person one thinks he is, it's very easy to remove that upstream person and his whole set of upstream persons from the present conclusion without disrupting the other parts of one's conclusion and without having to rewrite from scratch a proof argument. The problem with that approach is that according to the Genealogical Proof Standard a good proof argument requires a synthesis of all of the collected evidence taken together, including conflicting evidence. If one decides that one set of John Smiths in the evidence of Fooville is not the John Smith that one is writing about, one must explain that, and why one believes it, in the proof argument. Simply dropping that set of records without explanation would give the impression that one hadn't conducted the required "reasonably exhaustive search". That said, there may be other benefits to an N-tiered recursive construct, and it isn't difficult to support -- a PrecursorPerson class with attributes Person, Researcher, and ArgumentString, with class Person having a 1->0..* association with PrecursorPerson will do the trick. The more interesting problem will be for applications which don't choose to support N-tiered to import GedcomX datasets which do. If we decide to support N-tiered we should at least outline a recommendation for how to go about that. |
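As an illustration only, the `PrecursorPerson` idea described above might be sketched in Python as follows. The attribute names mirror the ones in the comment (Person, Researcher, ArgumentString); the `dissociate` helper and the `is_persona` method are hypothetical additions for this sketch, not part of any GEDCOM X specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """A person record; one with no precursors plays the role of a persona."""
    name: str
    # 1 -> 0..* association with PrecursorPerson
    precursors: List["PrecursorPerson"] = field(default_factory=list)

    def is_persona(self) -> bool:
        # Assumption for this sketch: a Person without precursors was
        # derived directly from a source document.
        return not self.precursors


@dataclass
class PrecursorPerson:
    """Association object tying an upstream Person to a proof argument."""
    person: Person
    researcher: str
    argument: str


def dissociate(conclusion: Person, upstream: Person) -> bool:
    """Drop one upstream Person (and, implicitly, everything above it).

    Returns True if a link was actually removed.
    """
    before = len(conclusion.precursors)
    conclusion.precursors = [
        link for link in conclusion.precursors if link.person is not upstream
    ]
    return len(conclusion.precursors) < before
```

Removing a link this way leaves the other precursors and their arguments untouched, which is the "ease of dissociation" advantage. It does nothing to record why the upstream person was dropped, which is the Genealogical Proof Standard concern raised in the same comment.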
+1
+1 I believe that an alternative approach is for a Person (Conclusion Model) to support a collection of Evidence, each Evidence containing a single link to the Persona (Record Model) which the researcher believes is (or is not) the Person being researched. |
Alright, @ttwetmore, I've started a wiki page to describe the n-tiered architecture and how it's supported. Would you be willing to fill in the relevant todo: sections for me? https://github.com/FamilySearch/gedcomx/wiki/The-N-Tiered-Genealogical-Architecture Once you fill that in, we can start hacking out how to support those concepts in GEDCOM-X until we get this right. |
@jralls and @EssyGreen, I think you've got some good points. Let's give @ttwetmore a chance to fill in his ideas so we can all make our own evaluations on how to best accommodate those ideas in a standard. |
@jralls summarizes my views quite well; thanks. I will write some on my own about places where there are some other shades of meaning when I get a little time. I'll try this afternoon. |
Sources have locations (where they are), descriptions (called metadata), and content (the evidence therein). We store the location and metadata in source records and source references. Where do we store the evidence? There are standard answers to this question, and there are variations:
For the remainder I assume that GEDCOMX goes with option 6 and includes evidence records, and more specifically, persona records. This is not a done deal, but for any discussion of the N-tier idea to make sense personas must exist. A model with personas also requires conclusion persons. A conclusion person is the classic person record of today’s genealogy programs. It is where all the information believed to be true about a real individual accumulates. It is the collection of facts that pertains to that individual, and each fact can have its own source reference to provide provenance. When using persona records and conclusion persons, genealogical research proceeds by uncovering evidence, extracting personas from the evidence, concluding which sets of personas refer to which individuals, and then building a conclusion person for each individual from those sets of personas. What happens to the personas during this process? There are two obvious answers. First, the personas can be consumed by the process, yielding up their facts to be added to the growing conclusion persons. Since each fact has the same source reference as its persona, the chain of evidence can be maintained at the fact level. After the facts are copied to the conclusion persons the personas are removed. There is a second answer, however, where the personas are not removed, and each conclusion person is linked to its own set of personas. In this scheme the conclusion persons inherit their facts from their personas, and in most cases the facts don’t need to be copied into the conclusion persons at all. There are advantages and disadvantages to both. The first approach can be implemented in every genealogy program around today. Each new persona is added to the program as a new person record, and when the user decides it describes the same individual as another person in the database, the user merges them together.
The disadvantage of this approach is that after the records are merged it is hard to correct errors, undo merges, and keep track of the decision history that led to the building of that conclusion person. Things are reversed for the approach that keeps the personas. One disadvantage is that little software today supports the approach. It also has the disadvantages of requiring more records in the database and more complexity in its implementation. Its advantage is in keeping a persistent, complete record of all the evidence, a clear history of the research process, and a nearly painless solution to the undoing of decisions when errors or new evidence are discovered. I strongly believe that the advantages of the persistent persona approach outweigh both its own disadvantages and the advantages of the first approach. A system that supports grouping persistent personas into sets that are bound together by conclusion persons to represent individuals is a 2-tier system. The New Family Search application is a 2-tier system. The current GEDCOMX model, with its Record Model and Conclusion Model, anticipates a 2-tier system. From a 2-tier model it is a short step to an N-tier model. In an N-tier model, person records are joined together into tree structures, with the persons at the leaves being persona records extracted from evidence, the persons at the roots being the user’s current set of conclusion persons, and the interior persons representing intermediate decisions and conclusions made at different times during the research process. In a 2-tier system the possibly long history of decisions a user makes during the process of bringing a set of personas together gets lost in the single proof statement for the conclusion person. In the N-tier system, the root person and all interior persons represent clean, direct and easily describable decision points during the research process.
The structure of the tree is therefore isomorphic to and fully captures the research and decision processes. The overall conclusion at the root person level is the recursive recapitulation of the decisions made at each node during the construction of the person tree. The N-tier approach also makes the conclusion making process reversible, and because the decision making in a tree is only partially ordered (i.e., the same tree can be built by making certain conclusions in different orders), the tree can be decomposed in an order other than opposite the order of construction. Therefore decomposing N-tier person structures can lead to a different and more likely set of conclusion persons than existed at any time before in the research process. You can actually advance in your research process when undoing incorrect decisions. Ironically the recursive N-tier system is simpler than a 2-tier system. A single person record suffices all the way up, from the persona level to the root conclusion level. Simple recursion suffices throughout. |
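A minimal sketch of the person tree described above, assuming a single recursive record type. The field names (`facts`, `sub_persons`, `proof`) and the helper methods are hypothetical, chosen only to illustrate how root persons could inherit facts from leaf personas and how one decision could be undone without touching the rest of the tree.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass(eq=False)  # compare by identity: two records are never "equal"
class Person:
    label: str
    facts: List[str] = field(default_factory=list)
    sub_persons: List["Person"] = field(default_factory=list)
    proof: str = ""  # why the sub-persons are believed to be one individual

    def personas(self) -> Iterator["Person"]:
        """Yield the leaves of the tree, i.e. the evidence-level records."""
        if not self.sub_persons:
            yield self
        for sub in self.sub_persons:
            yield from sub.personas()

    def inherited_facts(self) -> List[str]:
        """Collect own facts plus those of every person below this one."""
        collected = list(self.facts)
        for sub in self.sub_persons:
            collected.extend(sub.inherited_facts())
        return collected

    def detach(self, node: "Person") -> bool:
        """Remove `node` and its whole subtree from anywhere below this person."""
        for index, sub in enumerate(self.sub_persons):
            if sub is node:
                del self.sub_persons[index]
                return True
            if sub.detach(node):
                return True
        return False
```

Because nothing is copied upward, detaching an interior node or a leaf changes what the root inherits immediately, and the detached subtree survives intact for reuse under a different conclusion person.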
Just a quick point on terminology ... I think @ttwetmore's use of source and evidence may be different from mine.... @ttwetmore seems to indicate (forgive me if I'm wrong) that the data in the source is the evidence (a bit like Evidence=Source.Data). For me the data in a source is just data unless used in the context of trying to prove something else (a bit like Evidence=Source.Data+Research.Subject). I'm not saying either is right or wrong but it may help to have a clear Glossary of definitions documented with the model so we all refer to things in the same way. |
@ttwetmore - the purpose of this thread seems to be to focus on:
Whilst I get the strength of your beliefs in your post, I am not clear on what your requirements are. I assume that the only need is the ability to link multiple Persona records together in some sort of object together with a ProofStatement? Is that what is being proposed/required? Or is there something else/different/more? Could you clarify for me in terms of the model rather than in terms of your particular application of the model? |
@EssyGreen
I believe the best way to meet these requirements is a system in which the collected evidence about persons is recorded in persistent persona records, and in which nearly all conclusions are tied directly and persistently to the decisions the user makes when deciding which persona records refer to which individuals. When those decisions are made I believe they should be fully recorded, and the method I have outlined for linking the persona records into a person tree structure is the one I believe best meets that need. The person tree structure is nothing more than an indelible representation and history of the decisions made during the research process. |
I'd like to add:
|
@jralls |
Yes that is true and I need the distinction in my model because the Evidence is an object in its own right ... but I digress ... my point was just to highlight the need for a common vocabulary.
So, what exactly is lacking in the current model which means that you cannot apply N-tier with it? |
I agree re the need to record searches and results (and also the goals - see #141 ) but the how and what details I think are worthy of a separate post. |
> So, what exactly is lacking in the current model which means that you cannot apply N-tier with it?

Almost nothing. We simply need a way for a person record to be able to refer to sub-person records. There are similar arguments to be made about evidence events, but I'd rather not muddy the waters. |
It's part of the administrative section. The GDM doesn't break the parts down into separate models, but the ERD does have some lines that rather arbitrarily divide it into "administrative", "evidence", and "conclusion" sections. At your request I've summarized the GDM in #138. Obviously I think that GedcomX needs to record searches. It's an element of the GPS, so it needs to be in the model. Sarah's right, though, I shouldn't be muddying up this issue with it. |
In this we agree. N-tier or non-N-tier can use this according to the needs of the user/application. |
Actually scratch that ... I should have read it more carefully .... I thought you meant a way for Person records to refer to Persona records (which I would agree with). If you mean Person->Person then I think the context needs to be specified or it is ambiguous (e.g. like old ALIAses). |
> Actually scratch that ... I should have read it more carefully .... I thought you meant a way for Person records to refer to Persona records (which I would agree with). If you mean Person->Person then I think the context needs to be specified or it is ambiguous (e.g. like old ALIAses).

I do mean Person->Person*, because in an N-tier there are no separate persona records -- the person records at the leaves are logical personas and that's it. But this is a specific 1-to-n relationship so it has its own semantics and label. I don't have any clever label for it -- I've been calling it "subPerson" until someone thinks of something better. If anyone thought there should be another kind of 1-n relationship between persons, that relationship would have to have its own name and semantics. |
Yes I got that (albeit the second time round hehe). It strikes me that this is the same thing as the old GEDCOM ALIAs (if not can you explain how it differs?) If so, then personally I would not use it because I think it adds unnecessary complexity (I have never yet seen an application that handles ALIAses well - and I can understand why) and it can be more clearly exemplified by creating new trees/files for the different possibilities which can then be used/referenced as sources if/when a conclusion is reached in the original file. Since with GEDCOMX we will now be able to handle recursive sources I don't see why we should add the complexity into the base model. If it is included as the standard then applications will either be forced to implement N-tier in the way you have specified or to reject any data formatted in this way (resulting in loss of data). If on the other hand it is not included in the base model, then applications wishing to implement N-tier in the way you specify can simply merge the Persons in the different trees/files using the ALIAs or equivalent. (ie upon finding a Person with a Persona which comes from a Source which is a GEDCOMX file then you can lookup the Persons which this Person represents in the Source file, add them as Persons to your file and link via the ALIAs) For that reason I would prefer it not to be implemented in the model. |
@EssyGreen ALIA: "An indicator to link different record descriptions of a person who may be the same person." The ALIA tag could be used to implement the N-tier structure. Of course GEDCOM is supposed to only hold conclusion persons, and the semantics of ALIA are supposed to mean "may be the same conclusion person." So though ALIA does provide a person->person* mapping, it 1) has entirely different semantics; and 2) has never been treated meaningfully in current software, so has never been implemented well. So to use the existence of the alias concept as a reason to reject the N-tier approach is a non-starter. However, to argue about what requirements the N-tier approach would place on a genealogy software program does make sense. If a program that can't handle the N-tier approach were to import GEDCOMX data with N-tier structures, that software would be in a quandary on how to proceed. One thing is for sure: the program couldn't claim to be GEDCOMX compatible. How big a concern must this be for GEDCOMX in defining a new model? For me the purpose of the N-tier structure is to allow full support of the research process (minus, as has been pointed out by @jralls, the administrative part). My main criticism of the current generation of desktop systems is that they don't provide the features that allow this support. I have always characterized the current set of desktop systems as conclusion only without providing enough research support. So my concern over whether the current set of programs could handle N-tier data is not high. I don't think any could support it without adding features. In the development of new standards how much concern should be placed on trying to model an area of human discourse in a relative vacuum of what's gone on before, versus how much concern should be placed on the precedents that have been set by the current set of standards, or on how difficult it will be for vendors to implement a new standard?
These are tough questions that GEDCOMX will have to answer. My answers to questions like these are always to do what from the technical or scientific or idealistic viewpoints seems the best. I would add the N-tier structure with the attitude that vendors adjust or fade away; that it's good for them even if they kick and scream. Maybe an extreme view that should be rejected out of hand. |
This is only true if GEDCOMX implements N-tier in the way you have specified.
Maybe, but it is not the only way that the research process can be fully supported - indeed by your own admission you do not deem the goal/search areas to be important (which others might). My concern is that N-tier (as you have specified it) is an overly complex model which many applications (and users) may have difficulty with. |
That depends on who holds the majority vote. If some "non-standard" application(s) are seen to support the research process better than those who simply uphold GEDCOMX then it could be GEDCOMX that fades away. In the current market, the applications which are in my opinion more useful only partially support the existing GEDCOM. Conversely, I know of at least one who rigidly stands by GEDCOM 5.5.0 and is (in my opinion) stifling itself by so doing. GEDCOMX must be flexible enough to support a multitude of applications not just a single way of doing things. |
@EssyGreen I view the addition of the goal/search stuff as such an easy thing to do that I don't worry about it -- someone who cares about it should decide what to add to GEDCOMX to support it and GEDCOMX should do it. I have freely admitted (it was among the disadvantages highlighted in my missive) that the N-tier model is complex (I would not go so far as to say overly complex as you have). It will be a challenge to implement for developers, but not so hard for experienced developers. A good implementation would go a long way to making it accessible for users. A few excellent user interface metaphors (e.g., moving index cards around on a desk top) could go a long way in helping a user -- simply provide them with a user interface that mimics the paper and pencil approach they now use. All that said, I agree with you fully that it is a complex idea with difficulties in its implementation and use. Do its advantages outweigh these disadvantages? Many say no, no, no, no, no. That's life in the high technology fast lane. |
> That depends on who holds the majority vote. If some "non-standard" application(s) are seen to support the research process better than those who simply uphold GEDCOMX then it could be GEDCOMX that fades away.

This is a dirty little secret that standards writers don’t talk about. If a killer app shows up that sweeps the stakes, its underlying model will become the de facto standard no matter what. My best approach, if GEDCOMX decides that the N-tier approach is too complex, would be to write that killer app ;)

> GEDCOMX must be flexible enough to support a multitude of applications not just a single way of doing things.

Of course, but there are some issues lurking. The big one, of course, is what does an application do when a GEDCOMX import file uses features that the application does not support? I’m sure you could come up with examples. Mine would center around the N-tier stuff, since that is a feature that would probably be implemented last if at all by some applications. |
They won't scream, they'll ignore, and GedcomX will be stillborn. |
> They won't scream, they'll ignore, and GedcomX will be stillborn.

You may be right. |
I'm sorry if you took my comments as criticism. They were not intended as criticisms of you or your model but simply my opinions of the implications of applying your particular implementation of N-tier in the context of GEDCOMX. Re your specific issues:
Agreed. I should have phrased it as "I believe because of the issues below that there will be a significant number of applications not willing/able to apply your particular implementation of N-tier and in migrating away from an N-tier to a non-N-tier application there will therefore be data loss."
OK so you now have an ALIAs object rather than just a link - presumably with evidence links as well as a proof statement ... this was a misunderstanding on my part and is an improvement from my perspective. However, I still maintain that the directional constraints (based on the order of discovery) which you outlined earlier ("the order of matching/linking/combining/merging") will need to be clarified in order to ascertain how/if the links should be validated:
Re:
I don't know what NFS is so can't follow your argument here (except by guesswork) but I agree that the UI is the key to making it clear .... hence my next point about the difficulty you are imposing on the developers in creating a good UI.
I think your comments here are the ones that are unfair! My comments do not assume anything of the sort!!! I have always found that the best way to develop a model is to base it on the real world. This makes it easier to develop and more intuitive when explaining concepts to users. I don't believe I am alone in this "misconception". To reiterate my main point:
I can see that your model would fit very neatly in the world of what I call social-networking genealogy (e.g. MyHeritage, GenesReUnited). You may also be right in that some genealogical research applications may be able to utilise your model in exactly the way you want it implemented. However, I personally think that it is a specific implementation which will not fit well in many other genealogical research applications and hence should not be considered core to the Conclusion Model. |
I think I prefer higher-tier or higher-level to upper. At ZoomInfo we called the very top level "individuals" and everything else persons or personas. Sorry I can't be much more help than that. |
Hi all. RootsTech is over. Back to work. Your comments are invited on the latest provisions to this issue. I'd like you to especially comment on a113ea8 which extracted out the n-tier architecture provisions into a separate specification. As I added some clarification to the documentation and to the examples, it became pretty clear that we needed a distinct place for this concept to be specified. This will allow us to specify more details about the n-tier architecture (such as how to resolve conflicts) in a distinct place where these principles can evolve separately from the underlying conceptual model.
See 9b8f2ba.
See 3a67f4f.
I'm listening. I couldn't think of any use case that would justify the cost of explaining what those things are. |
Hmm. OK. I guess. There's not much to like about the element name, and ISTM 2-tier will cause most software to choke, too. But see below.
OK, rereading in context instead of from the changeset makes it a little clearer (formatting helps ;-) ). Maybe the extensive discussion of "Persona Constraints" belongs in the n-tier document. The description in the conceptual spec could be something like "flag for layered-person evidence architectures" and mention the n-tier spec as the place to look for more information. A weak reference rather than a hard dependency. I'm starting to see 2-tier as a special case of n-tier rather than an extension of 0-tier, because I think that most current software isn't going to be able to handle any layering. That suggests that your "complianceRequirement" element should apply to both -- but complianceRequirement isn't a very good name. Why not "evidenceArchitecture" with values "traditional", "2-tier", or "n-tier"? Even a program capable of handling all three would benefit from being told up front which kind to expect, particularly if it doesn't want to construct the DOM tree in memory. |
I disagree because I think the notion of a "persona" is universal and distinct from evidence architectures. For example, there are providers who are only interested in providing personas (think exchanging data from the 1940 U.S. Census Project) but aren't concerned how those personas are used to build up an evidence architecture.
Agreed.
Okay. How about just "constraint"?
But I wanted to keep the notion of compliance requirements flexible enough to handle more than just evidence architectures. I can imagine other data profiles that applications may want to force compliance with, too. |
I'm not too fond of the spec allowing too many variations. I mentioned it briefly as we discussed PlaceDescriptions, but in general I think a spec should be very strict regarding the ability to express the same data in multiple ways. The fewer ways possible, the better. To me, 2-tier seems an artificial limitation of n-tier that will only annoy me in daily use. If possible, I'd get rid of traditional entirely, but I recognize that it might make migrations easier. The descriptions of n-tier in the new document look reasonable. |
I don't wish to try to read Ryan's mind, but I do wish to say something. I see the developing GEDCOM-X standard as being able to support different models/philosophies for genealogical data. First, conventional conclusion based genealogy where we only record summaries of information and their sources, where the person records are bags of facts, each fact taken from a possibly different source, and each fact, if the user is not too lazy, having a reference to its particular source record. Call this 1-tier. Or call this the underlying model of 99% of all current desktop and online genealogical applications. But Ryan also wants the model to be useful in systems where there are distinct conclusion and evidence layers, where the conclusion persons are not bags of facts, but instead are bags of references to a lower tier of person records that contain facts; and each lower level person is restricted to containing only the facts extracted from exactly one source. Call this 2-tier. New Family Search trees had this to some extent. And what the heck then. Also allow GEDCOM-X to be used in more complex models. Thus N-tier. Having N-tiers allows each node to represent a decision (that all lower persons are the same real person), which in turn allows complex decision trees to be easily managed. I love it. But how many years will go by before any system truly supports it? I am just wonderfully gratified to know that GEDCOM-X is willing to anticipate that use. GEDCOM-X does not say anything about the models that should be used. GEDCOM-X simply enables developers to choose whatever model they believe is best for their application. I think of GEDCOM-X as specifying capabilities, but not the requirements on how to use the capabilities. What is so wonderful about the whole thing, in my opinion, is that in order to support 2-tier and N-tier, all a data model has to do is have a way for any person record to be able to refer to a list of other person records.
There is some discussion that you also need a tag in order to specify the kind of person record. Personally I don't think this is that important, but it's such an insignificant thing, it's hard to argue about it. If GEDCOM-X, because it supports N-tier, tried to specify that systems using it had to support N-tier, GEDCOM-X would go down in dust. GEDCOM-X's job is simply to be there for whatever genealogical model a developer wants to implement. It's a meta-model at the capability level. This all begs the question of how easy or difficult it will be to transfer data between different genealogical systems using GEDCOM-X as the transport mechanism. Think about the problems of transferring data from a 1-tier system to an N-tier system, and vice versa. There are some real issues in here. Are those issues big enough to declare that the goals of GEDCOM-X being capability based rather than requirements based are ill conceived? Speaking only for myself, I don't think so. |
Ah, the "record model" rears its head again. ;-) Anyway, I said "extensive discussion", not the flag itself. The discussion is largely about connecting persona-constrained Persons into not-persona-constrained Persons. That action creates a 2+-tier Persona architecture. The conceptual spec can just provide a definition of "persona" and say that the persona constraint tells readers that this Person record fits the description.
It's not really a constraint on the whole GedcomX document, is it? Even the fact that a document contains persona-constrained Person instances shouldn't be a big deal to a 0-tier application: That's what such an application is going to expect from an item of evidence, even if the devs don't know to call it that. What's going to cause trouble is reading a GedcomX document that links (or, as Tom points out, doesn't) Person instances into a hierarchy.
Can you articulate any of them? What if instead of calling it an "evidence architecture" you call it a "person-conclusion architecture", which more accurately describes what the introduction of personas does? |
Unfortunately that would mean that n-tier must be excluded. Otherwise the only application to use it will be FS FamilyTree (assuming that FSFT is moving towards n-tier). |
Tom,
That's the rub, isn't it? At least a 0-tier program could flatten the tiers into a single Person instance. Teasing a flat person back out into tiers would be pretty difficult.
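To make the asymmetry concrete, here is a rough sketch of how a flat ("0-tier") program might collapse a tiered person on import. The record shape (dicts with `name`, `facts`, `source`, `subPersons` keys) is an assumption made for this example, not a GEDCOM X structure.

```python
from typing import Dict, List, Tuple


def flatten(person: Dict) -> Dict:
    """Collapse a tree of person records into one flat person.

    Every fact keeps the source label of the record it came from, so
    fact-level provenance survives. The tree shape and any intermediate
    proof arguments do not.
    """
    facts: List[Tuple[str, str]] = []

    def walk(node: Dict) -> None:
        # Hypothetical default label for records that carry no source.
        source = node.get("source", "(conclusion)")
        for fact in node.get("facts", []):
            facts.append((fact, source))
        for sub in node.get("subPersons", []):
            walk(sub)

    walk(person)
    return {"name": person["name"], "facts": facts}
```

The output is a single bag of sourced facts. Nothing in it says which facts were grouped together at which decision point, which is why going back from the flat form to tiers cannot be done mechanically.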
I don't understand what you mean. "goals of GEDCOM-X" seems rather broad. |
If this is about "profiles", can we just call it a profile?

```xml
<gedcomx>
  <profile resource="http://gedcomx.org/n-tier-evidence-architecture/v1"/>
  <!-- rest of the data goes here -->
</gedcomx>
```
|
"Profile" is at least nicer than "complianceRequirement"! But what's a "profile"? |
Thanks.
A similar discussion is recent at rootsdev! 1-tier systems are called C-systems and 2-tier systems are called E-systems (you can probably figure out what the letters mean). The actual discussion is about how to collaborate between systems of the two types, meaning that the users want to share their data back and forth many times.
What I would like the goals of GEDCOM-X to be -- the archival and transport format for genealogical data that includes all source, evidence, and conclusion information, where the data is digitally transcribed and structured and not simply narrative; attached images are fine. |
Seems rather more ambitious than what the users keep telling us at conferences that they want, which is the ability to losslessly transfer their data from program a to program b. I'm not unsympathetic to the ideal, but until someone actually writes a program that accomplishes the goal, writing a spec for interchange of that sort of data seems to be putting the cart before the horse. I don't think that it's yet possible to "digitally transcribe and structure" genealogical (or any other complex) reasoning. Well, transcribing is easy enough, but what does "transcribing without narrative" mean? As Thad points out, an n-tier Person/Persona architecture doesn't come close to covering the complexity of high-quality genealogical reasoning: It actually gets in the way because it emphasizes direct evidence over indirect evidence. |
@ttwetmore I think you did a great job of reading my mind. And you articulated it beautifully. Thanks.
Indeed. Good point.
Uh... yeah. That's what the attached changes are proposing, no?
Sure. But I'm afraid the ones I can come up with in the time I'm allotting myself might sound a bit contrived:
I like it. |
Not when they go on about linking persona-constrained-Persons into not-constrained-Persons. Then they're about n-tier and they belong in the n-tier document.
Contrived indeed. I don't think that's the sort of thing that we want to encourage: The paradigm for data exchange protocols is to write strictly and read liberally; this encourages the opposite. Client applications should be encouraged to read all GedcomX files, accepting what data they can use and issuing warnings for items they can't. |
Agreed. But I'm not sure what you're referring to in the conceptual model document. The only place that "goes on about linking persona-constrained-Persons into not-constrained-Persons" is in the identifier examples, and I don't think that's specific to n-tier at all; I think that's generally applicable.
Agreed, but I think the right place to encourage/discourage these kinds of things is in the marketplace. I think this sort of feature provides a means whereby new and innovative ideas, constraints, and profiles (like n-tier) can be proven out and either thrive or die based on their success. |
I don't like the idea of profiles at all. That seems to open a can of worms where everyone can define incompatible profiles that could potentially completely break interchange. |
So guys, I understand the visceral reaction to the proposal. I really do. But I'd like you to set aside the "gut reaction" for just a minute and consider the following:
I guess I feel that providing a means to declare data profiles doesn't end up harming the marketplace at all, but in fact enables it to grow and develop and mature by providing a way to propose and vet new and innovative ideas. |
Yeah, the experience with GEDCOM really supports that claim. Right. Providers sell software based on feature lists. They know their users want them to be able to exchange data, but they also want to lock users in to their upgrade cycle (or their website's ads). The solution? They provide the feature that users demand but they cripple it so that it doesn't actually do what the users want. |
Rubbish. A GedcomX parser can easily figure out from the data whether it's getting a 0-tier or an n-tier file and deal with it or bail out. Having a flag at the beginning might make it easier. It doesn't really contribute anything, but if restricted to that one thing it is relatively harmless. The problem is that you want to generalize it and allow anyone to invent "constraint" flags. That's just giving the vendors another opportunity to have GedcomX on their bullet lists but make it non-functional for exchanging datasets. |
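As a rough illustration of detecting the shape from the data alone, the sketch below measures the longest chain of person-to-precursor links. The input layout (a dict mapping each person id to the ids it aggregates) and the name `tier_depth` are hypothetical, chosen only for this example.

```python
def tier_depth(persons):
    """Return the longest chain of person -> precursor links in the dataset.

    `persons` maps a person id to the list of person ids it aggregates
    (an assumed layout, not the GedcomX serialization).
    0 means nobody aggregates anybody; values above 1 indicate nesting.
    """
    cache = {}

    def depth(pid, seen):
        if pid in cache:
            return cache[pid]
        if pid in seen:
            raise ValueError(f"cycle detected at {pid!r}")
        precursors = persons.get(pid, [])
        if precursors:
            result = 1 + max(depth(p, seen | {pid}) for p in precursors)
        else:
            result = 0
        cache[pid] = result
        return result

    return max((depth(pid, frozenset()) for pid in persons), default=0)
```

An importer that only handles flat aggregation could call this first and then either proceed, flatten, or bail out with a clear message when the depth exceeds what it supports.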
Rubbish again. There's no barrier to using whatever data model one likes when writing a Genealogy program. There is a barrier to writing an importer for a format that supports more than one data model: One must have an importer for every supported data model. If the format supports any arbitrary data model the task becomes impossible and the format is useless. |
Picking up this thread where it last got left, I will not be pursuing my suggestion to support the notion of "data profiles". Even so, I want to make it clear that this is not because I share @jralls's sardonic analysis of genealogical data providers. I refuse to attribute to malice what can rightfully be attributed to toolset inadequacies or even just ignorance. |
With the close of issues #242, #244, and #246 I believe the n-tier model to be supported. To illustrate, I'm going to lift @mikkelee's example at #232. Note that I'm not mentioning things like source references, analysis documents, etc. for the sake of simplicity.
I am quite certain that Johan G. in A + B is the same person. A

I am also quite certain that Johan F. in C + D is the same person. A

I come across another (hypothetical) source that talked about person E, and that I use to conclude that A/B is the same as C/D because he used his father's first name as a patronymic.

Two-Tier Implementations

The two-tier implementation would end up with a

N-Tier Implementations

The n-tier implementation would end up with a

Your comments are welcome. I expect to close out this issue early next week. Thank you for all of your contributions. |
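One way the two shapes could differ for the Johan example is sketched below. This is only an illustration under assumed names: the `Person` class, its `precursors` field, the ids `AB`, `CD`, `ABCD`, and the `flatten` helper are all hypothetical and not taken from the spec, and attaching E at the top level is my own guess at the modelling.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Hypothetical person record; `precursors` lists the persons it aggregates."""
    id: str
    precursors: list = field(default_factory=list)


def leaves(person):
    """Collect the source-level persons at the bottom of the tree, in order."""
    if not person.precursors:
        return [person]
    found = []
    for precursor in person.precursors:
        found.extend(leaves(precursor))
    return found


def flatten(person):
    """Two-tier view: one conclusion pointing directly at every source-level person."""
    return Person(person.id, leaves(person))


# Source-level persons from the five records.
a, b, c, d, e = (Person(name) for name in "ABCDE")

# N-tier: intermediate conclusions are kept as their own persons.
johan_g = Person("AB", [a, b])
johan_f = Person("CD", [c, d])
n_tier = Person("ABCD", [johan_g, johan_f, e])

# Two-tier: the same conclusion, with the intermediate layer removed.
two_tier = flatten(n_tier)
```

In this sketch `flatten` keeps every source-level person but discards the intermediate `AB` and `CD` conclusions, so any argument attached to those would need to be carried over separately.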
+1 With the caveat that we may want to revisit this after publication of Robert Charles Anderson's The Elements of Genealogical Analysis. I attended his preview lecture on Saturday and immediately recognized that it's about n-tier. Not surprising, since that grew out of the Gentech model of which RCA was a principal author. |
I worry that data gets lost when aggregating persons from n-tier to 2-tier. Though I suppose that's up to the implementer to ensure won't happen. +1 on N-tier, from what I can tell the current spec will knead my suits :) |
No worries. There aren't any implementations of either. ;-) |
We started on this with #72, but things have changed a bit. What we need are some recipes.
See #138 for discussion on this point.